NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

FaSTCC: Fast Sparse Tensor Contractions on CPUs

https://doi.org/10.1145/3712285.3759841

Raje, Saurabh; McCoy, Hunter; Rountev, Atanas; Pandey, Prashant; Sadayappan, P (November 2025, ACM)

Free, publicly-accessible full text available November 15, 2026
Evaluating Learned Indexes for External Memory Joins

Chesetti, Yuvaraj; Pandey, Prashant (July 2025, SIAM ACDA)

Free, publicly-accessible full text available July 28, 2026
Zombie Hashing: Reanimating Tombstones in a Graveyard

CHESETTI, YUVARAJ; SHI, BENWEI; PHILLIPS, JEFF M; PANDEY, PRASHANT (June 2025, ACM SIGMOD)

Free, publicly-accessible full text available June 20, 2026
Beyond Bloom: A Tutorial on Future Feature-Rich Filters

https://doi.org/10.1145/3626246.3654681

Pandey, Prashant; Farach-Colton, Martín; Dayan, Niv; Zhang, Huanchen (June 2024, ACM)

Full Text Available
Adaptive Quotient Filters

https://doi.org/10.1145/3677128

Wen, Richard; McCoy, Hunter; Tench, David; Tagliavini, Guido; Bender, Michael A; Conway, Alex; Farach-Colton, Martin; Johnson, Rob; Pandey, Prashant (October 2024, Proceedings of the ACM on Management of Data)

Filters trade off accuracy for space and occasionally return false positive matches with a bounded error. Numerous systems use filters in fast memory to avoid performing expensive I/Os to slow storage. A fundamental limitation in traditional filters is that they do not change their representation upon seeing a false positive match. Therefore, the maximum false positive rate is only guaranteed for a single query, not for an arbitrary set of queries. We can improve the filter's performance on a stream of queries, especially on a skewed distribution, if we can adapt after encountering false positives. Adaptive filters, such as telescoping quotient filters and adaptive cuckoo filters, update their representation upon detecting a false positive to avoid repeating the same error in the future. Adaptive filters require an auxiliary structure, typically much larger than the main filter and often residing on slow storage, to facilitate adaptation. However, existing adaptive filters are not practical and have not been adopted in real-world systems for two main reasons. First, they offer weak adaptivity guarantees, meaning that fixing a new false positive can cause a previously fixed false positive to come back. Secondly, the sub-optimal design of the auxiliary structure results in adaptivity overheads so substantial that they can actually diminish overall system performance compared to a traditional filter. In this paper, we design and implement the \sysname, the first practical adaptive filter with minimal adaptivity overhead and strong adaptivity guarantees, which means that the performance and false-positive guarantees continue to hold even for adversarial workloads. The \sysname is based on the state-of-the-art quotient filter design and preserves all the critical features of the quotient filter such as cache efficiency and mergeability. Furthermore, we employ a new auxiliary structure design which results in considerably low adaptivity overhead and makes the \sysname practical in real systems. We evaluate the \sysname by using it to filter queries to an on-disk B-tree database and find no negative impact on insert or query performance compared to traditional filters. Against adversarial workloads, the \sysname preserves system performance, whereas traditional filters incur 2× slowdown from adversaries representing as low as 1% of the workload. Finally, we show that on skewed query workloads, the \sysname can reduce the false-positive rate 100× using negligible (1/1000th of a bit per item) space overhead.
more » « less
Full Text Available
BYO: A Unified Framework for Benchmarking Large-Scale Graph Containers

https://doi.org/10.14778/3665844.3665859

Wheatman, Brian; Dong, Xiaojun; Shen, Zheqi; Dhulipala, Laxman; Łącki, Jakub; Pandey, Prashant; Xu, Helen (May 2024, Proceedings of the VLDB Endowment)

A fundamental building block in any graph algorithm is agraph container -- a data structure used to represent the graph. Ideally, a graph container enables efficient access to the underlying graph, has low space usage, and supports updating the graph efficiently. In this paper, we conduct an extensive empirical evaluation of graph containers designed to support running algorithms on large graphs. To our knowledge, this is the firstapples-to-applescomparison of graph containers rather than overall systems, which include confounding factors such as differences in algorithm implementations and infrastructure. We measure the running time of 10 highly-optimized algorithms across over 20 different containers and 10 graphs. Somewhat surprisingly, we find that the average algorithm running time does not differ much across containers, especially those that support dynamic updates. Specifically, a simple container based on an off-the-shelf B-tree is only 1.22× slower on average than a highly optimized static one. Moreover, we observe that simplifying a graph-container Application Programming Interface (API) to only a few simple functions incurs a mere 1.16× slowdown compared to a complete API. Finally, we also measure batch-insert throughput in dynamic-graph containers for a full picture of their performance. To perform the benchmarks, we introduce BYO, a unified framework that standardizes evaluations of graph-algorithm performance across different graph containers. BYO extends the Graph Based Benchmark Suite (Dhulipala et al. 18), a state-of-the-art graph algorithm benchmark, to easily plug into different dynamic graph containers and enable fair comparisons between them on a large suite of graph algorithms. While several graph algorithm benchmarks have been developed to date, to the best of our knowledge, BYO is the first system designed to benchmark graph containers.
more » « less
Full Text Available
Vector Quotient Filters: Overcoming the Time/Space Trade-Off in Filter Design

https://doi.org/10.1145/3448016.3452841

Pandey, Prashant; Conway, Alex; Durie, Joe; Bender, Michael A.; Farach-Colton, Martin; Johnson, Rob (July 2023, SIGMOD)

Full Text Available
Communication Optimization for Distributed Execution of Graph Neural Networks

https://doi.org/10.1109/IPDPS54959.2023.00058

Kurt, Süreyya Emre; Yan, Jinghua; Sukumaran-Rajam, Aravind; Pandey, Prashant; Sadayappan, P (May 2023, IEEE)
Using molecular straps to engineer conjugated porous polymer growth, chemical doping, and conductivity

https://doi.org/10.1039/D3SC00983A

Mohanan, Manikandan; Ahmad, Humayun; Ajayan, Pooja; Pandey, Prashant K.; Calvert, Benjamin M.; Zhang, Xinran; Chen, Fu; Kim, Sung J.; Kundu, Santanu; Gavvalapalli, Nagarjuna (May 2023, Chemical Science)

Controlling network growth and architecture of 3D-conjugated porous polymers (CPPs) is challenging and therefore has limited the ability to systematically tune the network architecture and study its impact on doping efficiency and conductivity. We have proposed that π-face masking straps mask the π-face of the polymer backbone and therefore help to control π–π interchain interactions in higher dimensional π-conjugated materials unlike the conventional linear alkyl pendant solubilizing chains that are incapable of masking the π-face. Herein, we used cycloaraliphane-based π-face masking strapped monomers and show that the strapped repeat units, unlike the conventional monomers, help to overcome the strong interchain π–π interactions, extend network residence time, tune network growth, and increase chemical doping and conductivity in 3D-conjugated porous polymers. The straps doubled the network crosslinking density, which resulted in 18 times higher chemical doping efficiency compared to the control non-strapped-CPP. The straps also provided synthetic tunability and generated CPPs of varying network size, crosslinking density, dispersibility limit, and chemical doping efficiency by changing the knot to strut ratio. For the first time, we have shown that the processability issue of CPPs can be overcome by blending them with insulating commodity polymers. The blending of CPPs with poly(methylmethacrylate) (PMMA) has enabled them to be processed into thin films for conductivity measurements. The conductivity of strapped-CPPs is three orders of magnitude higher than that of the poly(phenyleneethynylene) porous network.
more » « less
Full Text Available
An incrementally updatable and scalable system for large-scale sequence search using the Bentley–Saxe transformation

https://doi.org/10.1093/bioinformatics/btac142

Almodaresi, Fatemeh; Khan, Jamshed; Madaminov, Sergey; Ferdman, Michael; Johnson, Rob; Pandey, Prashant; Patro, Rob; Boeva, ed., Valentina (March 2022, Bioinformatics)

Abstract MotivationIn the past few years, researchers have proposed numerous indexing schemes for searching large datasets of raw sequencing experiments. Most of these proposed indexes are approximate (i.e. with one-sided errors) in order to save space. Recently, researchers have published exact indexes—Mantis, VariMerge and Bifrost—that can serve as colored de Bruijn graph representations in addition to serving as k-mer indexes. This new type of index is promising because it has the potential to support more complex analyses than simple searches. However, in order to be useful as indexes for large and growing repositories of raw sequencing data, they must scale to thousands of experiments and support efficient insertion of new data. ResultsIn this paper, we show how to build a scalable and updatable exact raw sequence-search index. Specifically, we extend Mantis using the Bentley–Saxe transformation to support efficient updates, called Dynamic Mantis. We demonstrate Dynamic Mantis’s scalability by constructing an index of ≈40K samples from SRA by adding samples one at a time to an initial index of 10K samples. Compared to VariMerge and Bifrost, Dynamic Mantis is more efficient in terms of index-construction time and memory, query time and memory and index size. In our benchmarks, VariMerge and Bifrost scaled to only 5K and 80 samples, respectively, while Dynamic Mantis scaled to more than 39K samples. Queries were over 24× faster in Mantis than in Bifrost (VariMerge does not immediately support general search queries we require). Dynamic Mantis indexes were about 2.5× smaller than Bifrost’s indexes and about half as big as VariMerge’s indexes. Availability and implementationDynamic Mantis implementation is available at https://github.com/splatlab/mantis/tree/mergeMSTs. Supplementary informationSupplementary data are available at Bioinformatics online.
more » « less

« Prev Next »

Search for: All records